Power iteration

In mathematics, power iteration (also known as the power method) is an eigenvalue algorithm: given a matrix A, the algorithm will produce a number \lambda, which is the greatest (in absolute value) eigenvalue of A, and a nonzero vector v, which is a corresponding eigenvector of \lambda, that is, Av = \lambda v. The algorithm is also known as the Von Mises iteration. [Richard von Mises and H. Pollaczek-Geiringer, Praktische Verfahren der Gleichungsauflösung, ZAMM - Zeitschrift für Angewandte Mathematik und Mechanik 9, 152-164 (1929).]

Power iteration is a very simple algorithm, but it may converge slowly. The most time-consuming operation of the algorithm is the multiplication of matrix A by a vector, so with an appropriate implementation it is effective even for very large sparse matrices. The speed of convergence is like |\lambda_2 / \lambda_1|^k, where k is the number of iterations (see a later section). In words, convergence is exponential, with the rate governed by the gap between the two largest eigenvalues in magnitude.
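As a rough illustration (the ratios here are assumed for the example, not taken from the text above): if |\lambda_2 / \lambda_1| = 0.5, the error shrinks by about half per iteration, so roughly 10 iterations reduce it by a factor of 1000; if the ratio is 0.9, the same reduction takes about \log(10^{-3}) / \log(0.9) \approx 66 iterations.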


The method
The power iteration algorithm starts with a vector b_0, which may be an approximation to the dominant eigenvector or a random vector. The method is described by the recurrence relation
b_{k+1} = \frac{Ab_k}{\|Ab_k\|}
So, at every iteration, the vector b_k is multiplied by the matrix A and normalized.

If we assume A has an eigenvalue that is strictly greater in magnitude than its other eigenvalues and the starting vector b_0 has a nonzero component in the direction of an eigenvector associated with the dominant eigenvalue, then a subsequence \left( b_{k} \right) converges to an eigenvector associated with the dominant eigenvalue.

Without the two assumptions above, the sequence \left( b_{k} \right) does not necessarily converge. In this sequence,

b_k = e^{i \phi_k} v_1 + r_k,
where v_1 is an eigenvector associated with the dominant eigenvalue, and \| r_{k} \| \rightarrow 0. The presence of the term e^{i \phi_{k}} implies that \left( b_{k} \right) does not converge unless e^{i \phi_{k}} = 1. Under the two assumptions listed above, the sequence \left( \mu_{k} \right) defined by
\mu_{k} = \frac{b_{k}^{*}Ab_{k}}{b_{k}^{*}b_{k}}
converges to the dominant eigenvalue; each \mu_k is the Rayleigh quotient of b_k.

One may compute this with the following algorithm (shown in Python with NumPy):

#!/usr/bin/env python3

import numpy as np

def power_iteration(A, num_iterations: int):
    # Ideally choose a random vector
    # to decrease the chance that our vector
    # is orthogonal to the eigenvector
    b_k = np.random.rand(A.shape[1])

    for _ in range(num_iterations):
        # calculate the matrix-by-vector product Ab
        b_k1 = np.dot(A, b_k)

        # calculate the norm
        b_k1_norm = np.linalg.norm(b_k1)

        # re-normalize the vector
        b_k = b_k1 / b_k1_norm

    return b_k

power_iteration(np.array([[0.5, 0.5], [0.2, 0.8]]), 10)  # example 2x2 input matrix

The vector b_k converges to an associated eigenvector. Ideally, one should use the Rayleigh quotient in order to get the associated eigenvalue.
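As a minimal sketch of that last point (assuming the power_iteration function above and reusing the 2 x 2 example matrix from the call), the dominant eigenvalue can be estimated from the returned vector via the Rayleigh quotient:

import numpy as np

A = np.array([[0.5, 0.5], [0.2, 0.8]])
b_k = power_iteration(A, 100)

# Rayleigh quotient (b_k^T A b_k) / (b_k^T b_k) approximates the dominant eigenvalue
eigenvalue_estimate = b_k @ (A @ b_k) / (b_k @ b_k)
print(eigenvalue_estimate)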

This algorithm is used to calculate the Google PageRank.

The method can also be used to calculate the spectral radius (the eigenvalue with the largest magnitude, for a square matrix) by computing the Rayleigh quotient

\rho(A) = \max \left \{ |\lambda_1|, \dotsc, |\lambda_n| \right \} = \frac{b_k^\top A b_k}{b_k^\top b_k}.


Analysis
Let A be decomposed into its Jordan canonical form: A=VJV^{-1}, where the first column of V is an eigenvector of A corresponding to the dominant eigenvalue \lambda_{1}. Since the dominant eigenvalue of A is unique, the first Jordan block of J is the 1 \times 1 matrix \lambda_1, where \lambda_{1} is the largest eigenvalue of A in magnitude. The starting vector b_0 can be written as a linear combination of the columns of V:

b_{0} = c_{1}v_{1} + c_{2}v_{2} + \cdots + c_{n}v_{n}.

By assumption, b_{0} has a nonzero component in the direction of the dominant eigenvector, so c_{1} \ne 0.

The computationally useful recurrence relation for b_{k+1} can be rewritten as:

b_{k+1}=\frac{Ab_{k}}{\|Ab_{k}\|}=\frac{A^{k+1}b_{0}}{\|A^{k+1}b_{0}\|},

where the expression: \frac{A^{k+1}b_{0}}{\|A^{k+1}b_{0}\|} is more amenable to the following analysis.

\begin{align}
b_k &= \frac{A^{k}b_{0}}{\| A^{k} b_{0} \|} \\
   &= \frac{\left( VJV^{-1} \right)^{k} b_{0}}{\|\left( VJV^{-1} \right)^{k}b_{0}\|} \\
   &= \frac{ VJ^{k}V^{-1} b_{0}}{\| V J^{k} V^{-1} b_{0}\|} \\
   &= \frac{ VJ^{k}V^{-1} \left( c_{1}v_{1} + c_{2}v_{2} + \cdots + c_{n}v_{n} \right)}{\| V J^{k} V^{-1} \left( c_{1}v_{1} + c_{2}v_{2} + \cdots + c_{n}v_{n} \right)\|} \\
   &= \frac{ VJ^{k}\left( c_{1}e_{1} + c_{2}e_{2} + \cdots + c_{n}e_{n} \right)}{\| V J^{k} \left( c_{1}e_{1} + c_{2}e_{2} + \cdots + c_{n}e_{n} \right) \|} \\
   &= \left( \frac{\lambda_{1}}{|\lambda_{1}|} \right)^{k} \frac{c_{1}}{|c_{1}|}
\frac{ v_{1} + \frac{1}{c_{1}} V \left( \frac{1}{\lambda_1} J \right)^{k} \left( c_{2}e_{2} + \cdots + c_{n}e_{n} \right)}{ \left \| v_{1} + \frac{1}{c_{1}} V \left( \frac{1}{\lambda_1} J \right)^{k} \left( c_{2}e_{2} + \cdots + c_{n}e_{n} \right) \right \| }
\end{align}

The expression above simplifies as k \to \infty

\left( \frac{1}{\lambda_{1}} J \right)^{k} =
\begin{bmatrix} 1 & & & & \\ & \left( \frac{1}{\lambda_{1}} J_{2} \right)^{k}& & & \\ & & \ddots & \\ & & & \left( \frac{1}{\lambda_{1}} J_{m} \right)^{k} \\ \end{bmatrix} \rightarrow \begin{bmatrix} 1 & & & & \\ & 0 & & & \\ & & \ddots & \\ & & & 0 \\ \end{bmatrix} \quad \text{as} \quad k \to \infty.

The limit follows from the fact that the eigenvalue of \frac{1}{\lambda_{1}} J_{i} is less than 1 in magnitude, so

\left( \frac{1}{\lambda_{1}} J_{i} \right)^{k} \to 0 \quad \text{as} \quad k \to \infty.

It follows that:

\frac{1}{c_{1}} V \left( \frac{1}{\lambda_1} J \right)^{k} \left( c_{2}e_{2} + \cdots + c_{n}e_{n} \right) \to 0 \quad \text{as} \quad k \to \infty

Using this fact, b_{k} can be written in a form that emphasizes its relationship with v_{1} when k is large:

\begin{align}
b_k &= \left( \frac{\lambda_{1}}{|\lambda_{1}|} \right)^{k} \frac{c_{1}}{|c_{1}|} \frac{v_{1} + \frac{1}{c_{1}} V \left( \frac{1}{\lambda_1} J \right)^{k} \left( c_{2}e_{2} + \cdots + c_{n}e_{n} \right)}{\left \| v_{1} + \frac{1}{c_{1}} V \left( \frac{1}{\lambda_1} J \right)^{k} \left( c_{2}e_{2} + \cdots + c_{n}e_{n} \right) \right \| } \\[6pt]
   &= e^{i \phi_{k}} \frac{c_{1}}{|c_{1}|} \frac{v_{1}}{\|v_{1}\|} + r_{k}
\end{align}

where e^{i \phi_{k}} = \left( \lambda_{1} / |\lambda_{1}| \right)^{k} and \| r_{k} \| \to 0 as k \to \infty

The sequence \left( b_{k} \right) is bounded, so it contains a convergent subsequence. Note that the eigenvector corresponding to the dominant eigenvalue is only unique up to a scalar, so although the sequence \left(b_{k}\right) may not converge, b_{k} is nearly an eigenvector of A for large k.
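As an informal numerical check of this conclusion (the test matrix below is an assumption chosen for illustration, not part of the analysis), one can compare the iterate b_k with the dominant eigenvector computed by NumPy's eigendecomposition; for large k the two agree up to a scalar factor:

import numpy as np

A = np.array([[2.0, 1.0], [1.0, 3.0]])   # assumed test matrix with a unique dominant eigenvalue

b_k = np.random.rand(A.shape[1])
for _ in range(50):
    b_k = A @ b_k
    b_k /= np.linalg.norm(b_k)

# dominant eigenvector from a direct eigendecomposition
eigvals, eigvecs = np.linalg.eig(A)
v1 = eigvecs[:, np.argmax(np.abs(eigvals))]

# cosine of the angle between b_k and v1; close to 1 in absolute value
print(abs(b_k @ v1) / (np.linalg.norm(b_k) * np.linalg.norm(v1)))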

Alternatively, if A is diagonalizable, then the following proof yields the same result.

Let \lambda_1, \lambda_2, \ldots, \lambda_m be the m eigenvalues (counted with multiplicity) of A and let v_1, v_2, \ldots, v_m be the corresponding eigenvectors. Suppose that \lambda_1 is the dominant eigenvalue, so that |\lambda_1| > |\lambda_j| for j>1.

The initial vector b_0 can be written:

b_0 = c_{1}v_{1} + c_{2}v_{2} + \cdots + c_{m}v_{m}.

If b_0 is chosen randomly (with uniform probability), then c_1 \ne 0 with probability 1. Now,

\begin{align}
A^{k}b_0 &= c_{1}A^{k}v_{1} + c_{2}A^{k}v_{2} + \cdots + c_{m}A^{k}v_{m} \\ &= c_{1}\lambda_{1}^{k}v_{1} + c_{2}\lambda_{2}^{k}v_{2} + \cdots + c_{m}\lambda_{m}^{k}v_{m} \\ &= c_{1}\lambda_{1}^{k} \left( v_{1} + \frac{c_{2}}{c_{1}}\left(\frac{\lambda_{2}}{\lambda_{1}}\right)^{k}v_{2} + \cdots + \frac{c_{m}}{c_{1}}\left(\frac{\lambda_{m}}{\lambda_{1}}\right)^{k}v_{m}\right) \\ &\to c_{1}\lambda_{1}^{k} v_1 && \left |\frac{\lambda_j}{\lambda_1} \right | < 1 \text{ for } j>1 \end{align}

On the other hand:

b_k = \frac{A^k b_0}{\|A^kb_0\|}.

Therefore, b_k converges to (a multiple of) the eigenvector v_1. The convergence is geometric, with ratio

\left| \frac{\lambda_2}{\lambda_1} \right|,

where \lambda_2 denotes the second dominant eigenvalue. Thus, the method converges slowly if there is an eigenvalue close in magnitude to the dominant eigenvalue.
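To see this rate concretely, here is a small sketch (the diagonal matrix and its eigenvalues are assumptions chosen for illustration): with \lambda_1 = 1 and \lambda_2 = 0.5, the distance from b_k to the dominant eigenvector should shrink by roughly a factor of |\lambda_2 / \lambda_1| = 0.5 per iteration.

import numpy as np

A = np.diag([1.0, 0.5])          # assumed example: lambda_1 = 1, lambda_2 = 0.5
v1 = np.array([1.0, 0.0])        # dominant eigenvector

b_k = np.array([1.0, 1.0])       # starting vector with a nonzero component along v1
b_k /= np.linalg.norm(b_k)

for k in range(1, 11):
    b_k = A @ b_k
    b_k /= np.linalg.norm(b_k)
    # distance to the dominant eigenvector decreases roughly like 0.5**k
    print(k, np.linalg.norm(b_k - v1))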


Applications
Although the power iteration method approximates only one eigenvalue of a matrix, it remains useful for certain computational problems. For instance, Google uses it to calculate the PageRank of documents in their search engine, and Twitter uses it to show users recommendations of whom to follow. [Pankaj Gupta, Ashish Goel, Jimmy Lin, Aneesh Sharma, Dong Wang, and Reza Bosagh Zadeh, "WTF: The who-to-follow system at Twitter", Proceedings of the 22nd International Conference on World Wide Web.] The power iteration method is especially suitable for sparse matrices, such as the web matrix, or as a matrix-free method that does not require storing the coefficient matrix A explicitly, but can instead access a function evaluating matrix-vector products Ax. For non-symmetric matrices that are well-conditioned, the power iteration method can outperform the more complex Arnoldi iteration. For symmetric matrices, the power iteration method is rarely used, since its convergence speed can easily be increased without sacrificing the small cost per iteration; see, e.g., Lanczos iteration and LOBPCG.
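As a minimal sketch of the matrix-free idea (the operator below, a diagonal-plus-rank-one matrix, is an assumption for illustration, not from the original text), the iteration only needs a routine that applies A to a vector; the n x n matrix itself is never formed:

import numpy as np

n = 1000
d = np.linspace(1.0, 2.0, n)          # assumed diagonal part
u = np.ones(n) / np.sqrt(n)           # assumed rank-one part

def matvec(x):
    # applies A = diag(d) + u u^T to x without ever storing A explicitly
    return d * x + u * (u @ x)

b_k = np.random.rand(n)
for _ in range(200):
    b_k = matvec(b_k)
    b_k /= np.linalg.norm(b_k)

# Rayleigh quotient estimate of the dominant eigenvalue of the implicit matrix
print(b_k @ matvec(b_k) / (b_k @ b_k))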

Some of the more advanced eigenvalue algorithms can be understood as variations of the power iteration. For instance, the inverse iteration method applies power iteration to the matrix A^{-1}. Other algorithms look at the whole subspace generated by the vectors b_k. This subspace is known as the Krylov subspace. It can be computed by Arnoldi iteration or Lanczos iteration. Gram iteration is a super-linear and deterministic method to compute the largest eigenpair.
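As a minimal sketch of the inverse iteration variant mentioned above (the test matrix is an assumption for illustration; this is not the article's own code), one runs the same loop but solves a linear system with A instead of multiplying by it, which converges to an eigenvector of the eigenvalue of smallest magnitude:

import numpy as np

def inverse_iteration(A, num_iterations: int):
    # power iteration applied to A^{-1}: solve A x = b_k instead of forming A^{-1}
    b_k = np.random.rand(A.shape[1])
    for _ in range(num_iterations):
        b_k = np.linalg.solve(A, b_k)
        b_k /= np.linalg.norm(b_k)
    return b_k

A = np.array([[2.0, 1.0], [1.0, 3.0]])   # assumed invertible test matrix
v = inverse_iteration(A, 50)
# Rayleigh quotient: approximates the eigenvalue of A closest to zero
print(v @ (A @ v) / (v @ v))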


See also
  • Rayleigh quotient iteration
  • Inverse iteration
